The upv-unige-CIAOSENSO WSD system
Abstract
The CIAOSENSO WSD system is based on Conceptual Density, WordNet Domains and frequencies of WordNet senses. This paper describes the upv-unige-CIAOSENSO WSD system, with which we participated in the English all-words task, and the versions of it used for the English lexical sample and WordNet gloss disambiguation tasks. In the latter, an additional goal was to check whether the disambiguation of glosses performed during our tests on the SemCor corpus had been done properly.

Introduction

The CIAOSENSO WSD system is an unsupervised system based on Conceptual Density (Agirre and Rigau, 1995), frequencies of WordNet senses, and WordNet Domains (Magnini and Cavaglià, 2000). Conceptual Density (CD) is a measure of the correlation between the sense of a given word and its context. The foundation of this measure is the Conceptual Distance, defined as the length of the shortest path connecting two concepts in a hierarchical semantic net. The starting point for our work was the CD formula of Agirre and Rigau (1995), which compares areas of subhierarchies. Noun sense disambiguation in the CIAOSENSO WSD system is performed by means of a formula combining Conceptual Density with WordNet sense frequency (Rosso et al., 2003).

WordNet Domains is an extension of WordNet 1.6, developed at ITC-irst (Istituto per la Ricerca Scientifica e Tecnologica, Trento, Italy), in which each synset has been annotated with at least one domain label, selected from a set of about two hundred hierarchically organized labels (Magnini and Cavaglià, 2000). Since the lexical resource used by the upv-unige-CIAOSENSO WSD system is WordNet 2.0 (WN2.0), it was necessary to map the synsets of WordNet Domains from version 1.6 to version 2.0. This was done in a fully automated way, using the WordNet mappings for nouns and verbs, and by checking the similarity of synset terms and glosses for adjectives and adverbs.
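
The similarity check used for adjectives and adverbs is not specified in detail here. As a minimal sketch only, one could score each pair of synsets by the overlap of their terms and of their gloss words, and copy the domain labels to the best match when the score is high enough. The Jaccard measure, the equal weighting, the 0.5 threshold, the function names and the toy synsets below are all assumptions made for illustration, not the procedure actually used in the system.

```python
import re

def tokens(text):
    """Lowercase alphanumeric word tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def synset_similarity(old, new, term_weight=0.5):
    """Weighted mix of term overlap and gloss overlap (weighting is an assumption)."""
    term_sim = jaccard(set(old["terms"]), set(new["terms"]))
    gloss_sim = jaccard(tokens(old["gloss"]), tokens(new["gloss"]))
    return term_weight * term_sim + (1 - term_weight) * gloss_sim

def map_domains(old_synsets, new_synsets, threshold=0.5):
    """Copy domain labels from each old synset to its most similar new synset.

    Returns (mapped, unresolved): mapped is {new_id: domains}; unresolved lists
    the ids of old synsets whose best match scores below the threshold.
    """
    mapped, unresolved = {}, []
    for old in old_synsets:
        best = max(new_synsets,
                   key=lambda new: synset_similarity(old, new),
                   default=None)
        if best is not None and synset_similarity(old, best) >= threshold:
            mapped[best["id"]] = old["domains"]
        else:
            unresolved.append(old["id"])
    return mapped, unresolved

# Hypothetical toy synsets; ids, glosses and labels are made up for illustration.
old_synsets = [
    {"id": "old-1", "terms": ["financial", "fiscal"],
     "gloss": "involving financial matters", "domains": ["economy"]},
    {"id": "old-2", "terms": ["nautical"],
     "gloss": "relating to ships or navigation", "domains": ["nautical"]},
]
new_synsets = [
    {"id": "new-1", "terms": ["fiscal", "financial"],
     "gloss": "involving financial matters"},
    {"id": "new-2", "terms": ["marine"],
     "gloss": "of or relating to the sea"},
]

mapped, unresolved = map_domains(old_synsets, new_synsets)
print(mapped)      # synsets that received domain labels automatically
print(unresolved)  # synsets left without a confident match
```

In this sketch, synsets that end up in `unresolved` would be the ones left for another strategy, such as manual assignment.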
Some domains were also assigned by hand where necessary.

1 Noun Sense Disambiguation

In our upv-unige-CIAOSENSO WSD system, noun sense disambiguation is carried out by means of the formula presented in (Rosso et al., 2003), which gave good results for the disambiguation of nouns over the SemCor corpus (precision 0.815). This formula was derived from the original Conceptual Density formula described in (Agirre and Rigau, 1995):
Similar resources
UPV-WSD : Combining different WSD Methods by means of Fuzzy Borda Voting
This paper describes the WSD system developed for our participation in SemEval-1. It combines various methods by means of fuzzy Borda voting. The fuzzy Borda vote-counting scheme is one of the best-known methods in the field of collective decision making. In our system the different disambiguation methods are considered as experts that give a preference ranking for the senses a word can be...
WSD system based on specialized Hidden Markov Model (upv-shmm-eaw)
We present a supervised approach to Word Sense Disambiguation (WSD) based on Specialized Hidden Markov Models. We used the SemCor corpus and the test data set provided by the Senseval-2 competition as training data, and WordNet 1.6 as the dictionary. We evaluated our system on the English all-words task of the Senseval-3 competition. 1 Description of the WSD System We consider WSD to be a tagging pro...
UniGe at CLEF 2009 Robust WSD Task (abstract only)
For our second participation in the Robust Word Sense Disambiguation (WSD) Task, we focused on performing a deep analysis of the ambiguity issue in the field of Information Retrieval. During the 2008 edition, we noted that although the WSD corpus allowed lifting lexical ambiguities, our results based on the corpus' WSD were not clearly better than those based on words only. We showed that lexic...
Application of Bacillus amyloliquefaciens as probiotic for Litopenaeus vannamei (Boone, 1931) cultivated in a biofloc system
Probiotics can improve the growth, survival and resistance to pathogenic organisms of species cultivated in aquaculture systems with water recirculation. However, their possible benefits in biofloc systems have been less studied. In this study, the benefits of the bacterium Bacillus amyloliquefaciens for a biofloc culture of Litopenaeus vannamei were evaluated. B. amyloliquefaciens was applied as d...
Introducing a machine-based approach to word sense disambiguation using the Lesk algorithm and part-of-speech tagging
The present study introduces a machine-based approach to word sense disambiguation (WSD). In Persian, a morphologically complex language in which many homographs arise, one way of doing WSD is to assign the right part-of-speech (POS) tags to words prior to WSD. Since the frequency of noun and adjective homographs in different Persian POS-tagged text corpora is high, POS tag disambi...